-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support providing multi config files #336
Conversation
@jcblemai i addressed our in person discussion issues:
|
Righto, the CI issues appear to be passing now. There's a lot of faff going on with passing around config. Troubleshooting the problems here hit against that design decision several times. |
d00b97a
to
200a950
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a good start, left comments on the big things, but I certainly missed smaller things. The big points I want to emphasize are:
- It doesn't look like there were any unit tests added for any of the new functionality which would be really helpful for maintaining the new features over time,
- There is a general lack of documentation/type hints which would be helpful in understanding what the new functionality does and allowing others to integrate it into their work,
- Lots of small stylistic changes, but importantly ones that help clean up the diff would be helpful for reviewing,
- A formatter like black is recommended for new files, and
- It's not clear to me that these changes are backwards compatible, if anything the changes to
examples/test_cli.py
suggest that they are not, which I think would create a large hurdle for operations. Is there a way to make these changes backwards compatible? Or if they are backwards compatible can the changes toexamples/test_cli.py
reflect that by providing the config both ways for all of the fixtures?
# Guidance for extending the CLI: | ||
# - to add a new small command to the CLI, can just add a new function with the @cli.command() decorator here (e.g. patch below) | ||
# - to add something with lots of module logic in it, should define that in the module (e.g. compartments for a group command, or simulate for a single command) | ||
# - ... and then import that module here to add it to the CLI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably go into the wiki instead of inline in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe - I think it's useful to have at hand when editing, though I agree we should be updating a contribution / style guide as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think putting in the wiki would be preferable, it would provide more space to give commentary and even work through a small example.
@@ -142,7 +142,7 @@ def __init__( | |||
|
|||
# SEIR modifiers | |||
self.npi_config_seir = None | |||
if config["seir_modifiers"].exists(): | |||
if config["seir_modifiers"].exists() and self.seir_modifiers_scenario is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the need that caused this change (as well as the similar one for outcome modifiers)? Is there a corresponding unit test to go with this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the problem here is passing both the config and the scenario. These are merged at the CLI boundary, but then internally, the whole config is still passed around and subset by X_modifiers_scenario
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this is definitely a get-this-working change - there needs to be a more thorough overall here, which would obviate this problem)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation, type hints, and __all__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might just prune this out, since I don't use it (@jcblemai didn't want to mark -c
deprecated). That said, it would be useful for deprecated options to the CLI - what do folks think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this isn't actually used then delete it. If we need it in the future we can dig it up again, but I'd rather not maintain something that isn't used.
examples/test_cli.py
Outdated
result = runner.invoke(simulate, ['-c', 'config_sample_2pop.yml']) | ||
result = runner.invoke(simulate, ['config_sample_2pop.yml']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the changes backwards compatible? I don't think that was entirely clear to me, and this seems to indicate that it is not. If it is, can this test be parameterized such that it hits both cases to prove backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are - note line 33 below. I changed these to demonstrate the new capabilities, and left the last one to demo backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've also added a test demonstrating the re-route of gempyor-simulate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It demonstrates the new capabilities, but I don't think it demonstrates backwards compatibility. I think ideally these tests would be parameterized such that they're invoked via the old method and the new method for each fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. Let's chat later today about the right testing regime here - I don't think we should need to reach all the way into execution to confirm that the resulting configs / arguments passed are identical. Might warrant a bit of re-architecture / reframing of tests generally to do that, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should need to reach all the way into execution to confirm that the resulting configs / arguments passed are identical.
Maybe, I think I would be more swayed in that not being needed if there were other unit tests to accompany this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree - but I also think it's going to entail re-orienting the click approach a bit. Right now, parse happens in the commands, rather than in the main group, then dispatched down.
I think its definitely preferrable to centralize this even more (i.e. not just in one function, but into one call of that function), but that's maybe a more radical overhaul of the function signatures.
Actually, looking again more closely I think it is backwards compatible, but I think my confusion at first glance highlights the need to cover both cases a bit more in that particular file as well as add documentation generally. |
Ready for re-review
I added items to test_cli - do you mean for the internal functions, like the option / config merger?
Added some of this, but could use some more perspective on what else is needed.
Did these.
Applied it.
It should be backwards compatible; I'll have a think about how best to include all the ways. Will mean re-assembling the full configs, I guess. |
Re still failing unit test case - not sure what the deal there is. Works locally? |
Haven't gone through this in full detail, but re: the tutorials, I would lean towards leaving the current configs as are so they are standalone readable, and including an example where you patch two configs together. |
Reverted back to the full ymls - where do the examples live? Happy to add there. |
|
Alright - so the latest is good then? Happy to modify any existing examples, but it's hard to know which are relevant / where they are. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The improvements look good, but I think we're still missing documentation and unit tests which are essential to maintaining this functionality over time and covering edge cases now. Here are some PRs with good test/documentation changes to use as examples:
- Document/Test
gempyor.utils.read_df/write_df
#247 - Document And Test
gempyor.file_paths
#250 - Update utils.py #260
- Document/Unit Test
gempyor.parameters
#277 - Document/Unit Test
gempyor.statistics
#304
We're pushing toward the Google style guide for documentation, and expanded set of examples can be found here provided by Napoleon.
And on the unit testing front I think at a minimum unit tests for cli
, option_config_files
, and parse_config_files
need to be implemented. See pytest's getting started guide or the PRs above for examples.
examples/test_cli.py
Outdated
result = runner.invoke(simulate, ['-c', 'config_sample_2pop.yml']) | ||
result = runner.invoke(simulate, ['config_sample_2pop.yml']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It demonstrates the new capabilities, but I don't think it demonstrates backwards compatibility. I think ideally these tests would be parameterized such that they're invoked via the old method and the new method for each fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the .yml
extension be added to the file name for syntax highlighting? This looks like valid YAML on its own. And same for the other .part
files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I'll pull the part
indicator into the file name.
""" | ||
An internal module to share common CLI elements. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Module docstrings must go at the top.
argument_config_files = click.argument( | ||
"config_files", nargs=-1, type=click.Path(exists=True) | ||
) | ||
""" `click` Argument decorator to handle configuration file(s) """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python does not support docstrings for variables, can just make these comments.
config_files: list[str], | ||
config_filepath: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a str
, is a pathlib.Path
. At least that's what the corresponding click argument/option suggests, you could coerce to a string before providing to this function.
first_sim_index: int, | ||
stoch_traj_flag: bool, | ||
) -> None: | ||
"""Parse the configuration file(s) and override with command line arguments""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, what is the right way to avoid double work here? The click options are in my mind the right place to have the documentation, and this def is not intended to public interface. I don't want us to get into double maintaining / inconsistency between the allowed options / arguments to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
along those lines: I propose a separate issue/PR here for a bit more focused refactor here to make options and such be a bit more dynamic. Swap to a kwargs-y approach, make the options list defined earlier in the file be named and use those names in the loops.
Does that seem reasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the way to avoid double work here is to put the documentation with this function. That's fine that they're not intended for the public interface (we should start denoting internal with a leading underscore if we're going to take care to define an external vs internal interface at this stage), but that doesn't exclude it from requiring documentation. We still need to know how to use the function internally when building out future CLI interfaces.
I'm not sure I follow the second comment that well. Why would we move to kwargs for parse_config_files
if we're providing them every time (at least from what I see in current usage) anyways, do we anticipate only providing a few of these args?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if parse_config_files
is kwargs-driven, then effectively the arguments can be driven by the names set in the arguments/options objects. Then whatever we do with those options in the future (and maybe that other people do with options for plugins), the parse function "just works". that clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you mean using **kwargs
. Sure, that could be done now or in a future PR doesn't matter too much either way, but it doesn't alleviate the need for documentation. One downside of **kwargs
is how to construct a valid call to this function becomes a bit more ambiguous since the valid arguments are not enumerated in the function definition itself. That could be addressed by well maintained documentation though.
Summarizing conversation @TimothyWillard and I had:
|
alright @TimothyWillard - how's the rough in strike you? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still missing unit tests for parse_config_files
. I'm a bit concerned about:
- How the documentation for
parse_config_files
is formatted as well as the parameters being given. I thought the take away was to use**kwargs
rather than specify exact args and then list what fields it would look for in aNotes:
section or similar. The current documentation is oddly formatted and not super helpful from a developer prospective. - I'm also a bit confused about the
--help
view of the commands, for example the simulate command's output has expanded with a lot of extra info that isn't helpful, I think coming from theclick_helpstring
decorator, but on the other hand theflepimop compartments
subcommands help pages don't list any of the options/arguments? I think ideally we want a consistent display with the acceptable options/arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few minor comments, but overall I'm excited about this PR. I think this sets the stage for a very simple and cohesive CLI interface. The unit testing also makes this maintainable over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So does this mean we can delete examples/test_cli.py
then? As far as correcting the CI goes it would just require removing the "Run gempyor-cli integration tests from examples" step from .github/workflows/gempyor-ci.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue yes. @jcblemai any reason to keep the original version of this around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should still have code that runs the actual examples in the example folder, but we should rename it to test_examples perhaps ?
flepimop/gempyor_pkg/tests/shared_cli/test_parse_config_files.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Timothy Willard <[email protected]>
1a8b6c7
to
daebcdc
Compare
|
||
You may provide an arbitrary number of separate configuration files to combine to create a complete configuration. | ||
|
||
At this time, only `simulate` supports multiple configuration files. Also, the patching operation is fairly crude: configuration options override previous ones completely, though with a warning. The files provided from left to right are from lowest priority (i.e. for the first file, only options specified in no other files are used) to highest priority (i.e. for the last file, its options override any other specification). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some examples would be informative to show the limitations. This could lead to very unexpected behaviour and since we have some external users, i think it would be useful to be explicit here about what this cannot do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will amend this - how about addressing, say, new seir scenarios?
I think in the future we can add flags that dictate this behavior. Say a |
Describe your changes.
This PR introduces support for using multiple configuration files. Briefly, commands like
will now (and going forward, more easily) support
n.b. the
-c
flag is no longer required for specifying config files.This PR adds the
patch
command:flepimop patch config1.yml config2.yml > config_combined.yml
which will produce the monolithic version of config file(s). Also, the specific options (e.g. write-csv, run_id, ...) are now also incorporated into the config;
patch
will capture those options and incorporate them into the resulting config file.This PR deprecates
gempyor-simulate ...
, in favor offlepimop simulate ...
;gempyor-simulate ...
still works, but issues a deprecation warning. The-c
option is not yet deprecated by this PR, but we anticipate it will be in the future.Documentation Updates
This PR includes new example partial YAML configurations. The gitbook documentation is updated to replace
gempyor-simulate
withflepimop simulation
and removes the-c
from those examples. There is also a new gitbook documentation section added explaining the use of multiple configuration files.